VIDEO STORY SEGMENTATION IN TRECVID 2004 Winston
Authors
Abstract
In this technical report, we give an overview of our technical developments in the story segmentation task in TRECVID 2004. Among them, we propose an information-theoretic framework, visual cue cluster construction (VC), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered that preserve the most information about the semantic labels. We extend the Information Bottleneck framework to high-dimensional continuous features and further propose a projection method that maps each video into probabilistic memberships over all the cue clusters. The biggest advantage of the proposed approach is that it removes the dependence on manually choosing the mid-level features and the huge labor cost of annotating a training corpus for the detector of each mid-level feature. When tested on TRECVID 2004 news video story segmentation, the proposed approach achieves a promising performance gain over representations derived from conventional clustering techniques, and even over manually selected mid-level features; it also achieves one of the top performances, F1=0.65, close to the highest performance, F1=0.69, reported by other groups. We also experiment with other promising visual features and continue investigating effective prosody features. The introduction of post-processing also provides practical improvements. Furthermore, fusion with other modalities, such as speech prosody features and ASR-based segmentation scores, yields significant gains, which this experiment confirms again.
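As a rough illustration of the quantity being maximized, the sketch below computes the mutual information I(C;Y) between discrete cluster assignments C and story-boundary labels Y. It is not the report's method: the report works with high-dimensional continuous features and soft (probabilistic) memberships, whereas this assumes hard cluster assignments. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(clusters, labels):
    """Return I(C;Y) in bits for two equal-length discrete sequences."""
    n = len(clusters)
    joint = Counter(zip(clusters, labels))  # joint counts n(c, y)
    pc = Counter(clusters)                  # marginal counts n(c)
    py = Counter(labels)                    # marginal counts n(y)
    mi = 0.0
    for (c, y), n_cy in joint.items():
        p_cy = n_cy / n
        mi += p_cy * math.log2(p_cy / ((pc[c] / n) * (py[y] / n)))
    return mi

# Toy data (hypothetical): 1 = story boundary, 0 = non-boundary.
labels = [1, 1, 0, 0, 1, 0, 0, 1]
informative = [0, 0, 1, 1, 0, 1, 1, 0]    # clusters fully determined by the label
uninformative = [0, 1, 0, 1, 0, 1, 0, 1]  # clusters independent of the label

print(mutual_information(informative, labels))
print(mutual_information(uninformative, labels))
```

A clustering that preserves more information about the labels scores higher under this measure, which is the sense in which cue clusters are described as "optimal" above.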
Similar resources
Discovery and fusion of salient multimodal features toward news story segmentation
In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...
Columbia-IBM News Video Story Segmentation in TRECVID 2004
In this technical report, we give an overview of our technical developments in the story segmentation task in TRECVID 2004. Among them, we propose an information-theoretic framework, visual cue cluster construction (VC), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered to preserve th...
Discovery and Fusion of Salient Multi-modal Features towards News Story Segmentation
In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...
Shot Boundary Determination on MPEG Compressed Domain and Story Segmentation Experiments for TRECVID 2003
KDDI R&D Laboratories has been participating in the past TREC conferences for text retrieval tasks. In this year we are newly participating in TRECVID 2003, namely the shot boundary determination and story segmentation tasks. In shot boundary determination task, we applied our proprietary shot segmentation algorithm originally proposed in [1] and slightly upgraded for this task. In our methods,...
University of Central Florida at TRECVID 2004
This year, the Computer Vision Group at University of Central Florida participated in two tasks in TRECVID 2004: High-Level Feature Extraction and Story Segmentation. For feature extraction task, we have developed the detection methods for “Madeleine Albright”, “Bill Clinton”, “Beach”, “Basketball Scored” and “People Walking/Running”. We used the adaboost technique, and has employed the speech ...
Publication date: 2005